Jul 25, 2014 7:21am
Latin texts mistaken for Russian
I often see Latin texts tagged as Russian by Google Books, for instance the just-imported https://archive.org/details/bub_gb_pPXFX1fM6gUC
Perhaps the error is caused by the ample Ancient Greek passages? Is it worth finding such books and reporting them to have the language corrected?
Granted, this book is clearly a nightmare to OCR in any language, given:
* skewed lines,
* varying font sizes, weird diacritics and stacked letters,
* pages with bites or scratches,
* ink passing through the paper from the other side,
* "dirty" paper (perhaps the Florence flood of 1966 is to blame too).
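On finding such books: one possible approach, purely a sketch and not anything Google Books or the Internet Archive is known to do, would be to look at which scripts actually occur in a book's OCR text. A book tagged Russian whose letters are almost all Latin (with some Greek) and hardly any Cyrillic is a candidate for review. The function names and the 5% threshold below are hypothetical, and the heuristic assumes the OCR output is still mostly in Latin script, which may not hold if the OCR engine itself was run with the wrong language.

```python
import unicodedata


def script_shares(text):
    """Return the fraction of letters in `text` that are Latin, Greek, or Cyrillic."""
    counts = {"LATIN": 0, "GREEK": 0, "CYRILLIC": 0}
    letters = 0
    for ch in text:
        if not ch.isalpha():
            continue  # skip digits, punctuation, whitespace, combining marks
        letters += 1
        # Unicode character names begin with the script, e.g. "LATIN SMALL LETTER A".
        name = unicodedata.name(ch, "")
        for script in counts:
            if name.startswith(script):
                counts[script] += 1
                break
    if letters == 0:
        return {script: 0.0 for script in counts}
    return {script: n / letters for script, n in counts.items()}


def looks_mistagged_as_russian(text, max_cyrillic=0.05):
    """Heuristic: text tagged Russian that contains almost no Cyrillic letters.

    `max_cyrillic` is an arbitrary illustrative threshold, not a tuned value.
    """
    shares = script_shares(text)
    return shares["CYRILLIC"] < max_cyrillic and shares["LATIN"] > shares["CYRILLIC"]
```

Feeding it the plain-text OCR of each Russian-tagged item would give a list of suspects that still needs checking by eye before reporting.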
This post was modified by Nemo_bis on 2014-07-25 14:21:33